Geometric Learning Dynamics

Vanchurin, Vitaly

arXiv.org Artificial Intelligence

We present a unified geometric framework for modeling learning dynamics in physical, biological, and machine learning systems. The theory reveals three fundamental regimes, each emerging from the power-law relationship $g \propto \kappa^\alpha$ between the metric tensor $g$ in the space of trainable variables and the noise covariance matrix $\kappa$. The quantum regime corresponds to $\alpha = 1$ and describes Schrödinger-like dynamics that emerges from a discrete shift symmetry. The efficient learning regime corresponds to $\alpha = \tfrac{1}{2}$ and describes very fast machine learning algorithms. The equilibration regime corresponds to $\alpha = 0$ and describes classical models of biological evolution. We argue that the emergence of the intermediate regime $\alpha = \tfrac{1}{2}$ is a key mechanism underlying the emergence of biological complexity.


Hessian Geometry of Latent Space in Generative Models

Lobashev, Alexander, Guskov, Dmitry, Larchenko, Maria, Tamm, Mikhail

arXiv.org Artificial Intelligence

This paper presents a novel method for analyzing the latent space geometry of generative models, including statistical physics models and diffusion models, by reconstructing the Fisher information metric. The method approximates the posterior distribution of latent variables given generated samples and uses this to learn the log-partition function, which defines the Fisher metric for exponential families. Theoretical convergence guarantees are provided, and the method is validated on the Ising and TASEP models, outperforming existing baselines in reconstructing thermodynamic quantities. Applied to diffusion models, the method reveals a fractal structure of phase transitions in the latent space, characterized by abrupt changes in the Fisher metric. We demonstrate that while geodesic interpolations are approximately linear within individual phases, this linearity breaks down at phase boundaries, where the diffusion model exhibits a divergent Lipschitz constant with respect to the latent space. These findings provide new insights into the complex structure of diffusion model latent spaces and their connection to phenomena like phase transitions. Our source code is available at https://github.com/alobashev/hessian-geometry-of-diffusion-models.
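The method above relies on the standard identity that, for an exponential family, the Fisher metric is the Hessian of the log-partition function. A minimal sketch of that identity on a toy Bernoulli family (not the paper's learned log-partition, and using a finite-difference Hessian rather than the paper's estimator):

```python
import math

def log_partition(theta):
    # Bernoulli exponential family: p(x|theta) ∝ exp(theta * x), x in {0, 1},
    # so the log-partition function is A(theta) = log(1 + e^theta).
    return math.log(1.0 + math.exp(theta))

def fisher_metric(theta, h=1e-4):
    # Fisher information = Hessian of the log-partition function;
    # here a second central finite difference in one dimension.
    return (log_partition(theta + h) - 2.0 * log_partition(theta)
            + log_partition(theta - h)) / h**2

# Sanity check: for a Bernoulli with mean p = sigmoid(theta),
# the Fisher information equals the variance p * (1 - p).
theta = 0.7
p = 1.0 / (1.0 + math.exp(-theta))
assert abs(fisher_metric(theta) - p * (1.0 - p)) < 1e-4
```

In the paper's setting the log-partition function is learned from samples rather than known in closed form, but the Hessian-to-metric step is the same.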


Molecular Learning Dynamics

Gusev, Yaroslav, Vanchurin, Vitaly

arXiv.org Artificial Intelligence

We apply the physics-learning duality to molecular systems by complementing the physical description of interacting particles with a dual learning description, where each particle is modeled as an agent minimizing a loss function. In the traditional physics framework, the equations of motion are derived from the Lagrangian function, while in the learning framework, the same equations emerge from learning dynamics driven by the agent loss function. The loss function depends on scalar quantities that describe invariant properties of all other agents or particles. To demonstrate this approach, we first infer the loss functions of oxygen and hydrogen directly from a dataset generated by the CP2K physics-based simulation of water molecules. We then employ the loss functions to develop a learning-based simulation of water molecules, which achieves comparable accuracy while being significantly more computationally efficient than standard physics-based simulations.


Covariant Gradient Descent

Guskov, Dmitry, Vanchurin, Vitaly

arXiv.org Artificial Intelligence

We present a manifestly covariant formulation of the gradient descent method, ensuring consistency across arbitrary coordinate systems and general curved trainable spaces. The optimization dynamics is defined using a covariant force vector and a covariant metric tensor, both computed from the first and second statistical moments of the gradients. These moments are estimated through time-averaging with an exponential weight function, which preserves linear computational complexity. We show that commonly used optimization methods such as RMSProp, Adam and AdaBelief correspond to special limits of the covariant gradient descent (CGD) and demonstrate how these methods can be further generalized and improved.
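To make the "special limits" concrete: with a diagonal metric built from exponentially weighted first and second gradient moments, the covariant update reduces to an Adam-like rule. A minimal sketch under that assumption (bias correction omitted; the hyperparameter names `beta1`, `beta2`, `eps` are conventional choices, not taken from the paper):

```python
import math

def cgd_step(theta, grad, m, v, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step with a diagonal metric estimated from gradient moments.

    m is the exponentially weighted first moment (the covariant force);
    v is the second moment, whose square root plays the role of a
    diagonal metric. This limit recovers an Adam-style update.
    """
    m = beta1 * m + (1.0 - beta1) * grad            # first moment
    v = beta2 * v + (1.0 - beta2) * grad * grad     # second moment (diagonal metric)
    theta = theta - lr * m / (math.sqrt(v) + eps)   # metric-preconditioned step
    return theta, m, v

# Usage: minimize f(x) = x^2 from a distant start
theta, m, v = 5.0, 0.0, 0.0
for _ in range(2000):
    grad = 2.0 * theta
    theta, m, v = cgd_step(theta, grad, m, v, lr=0.05)
```

The full covariant formulation replaces the diagonal `sqrt(v)` with a general metric tensor, which is where the generalizations beyond RMSProp, Adam, and AdaBelief come from.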


Emergent field theories from neural networks

Vanchurin, Vitaly

arXiv.org Artificial Intelligence

We establish a duality relation between Hamiltonian systems and neural network-based learning systems. We show that the Hamilton-Jacobi equations for position and momentum variables correspond to the equations governing the activation dynamics of non-trainable variables and the learning dynamics of trainable variables. The duality is then applied to model various field theories using the activation and learning dynamics of neural networks. For Klein-Gordon fields, the corresponding weight tensor is symmetric, while for Dirac fields, the weight tensor must contain an anti-symmetric tensor factor. The dynamical components of the weight and bias tensors correspond, respectively, to the temporal and spatial components of the gauge field.


Dataset-learning duality and emergent criticality

Kukleva, Ekaterina, Vanchurin, Vitaly

arXiv.org Artificial Intelligence

In artificial neural networks, the activation dynamics of non-trainable variables is strongly coupled to the learning dynamics of trainable variables. During the activation pass, the boundary neurons (e.g., input neurons) are mapped to the bulk neurons (e.g., hidden neurons), and during the learning pass, both bulk and boundary neurons are mapped to changes in trainable variables (e.g., weights and biases). For example, in feed-forward neural networks, forward propagation is the activation pass and backward propagation is the learning pass. We show that a composition of the two maps establishes a duality map between a subspace of non-trainable boundary variables (e.g., dataset) and a tangent subspace of trainable variables (i.e., learning). In general, the dataset-learning duality is a complex non-linear map between high-dimensional spaces, but in a learning equilibrium, the problem can be linearized and reduced to many weakly coupled one-dimensional problems. We use the duality to study the emergence of criticality, or the power-law distributions of fluctuations of the trainable variables. In particular, we show that criticality can emerge in the learning system even from the dataset in a non-critical state, and that the power-law distribution can be modified by changing either the activation function or the loss function.


Autonomous particles

Andrejic, Nikola, Vanchurin, Vitaly

arXiv.org Artificial Intelligence

Consider a reinforcement learning problem where an agent has access to a very large amount of information about the environment, but it can only take very few actions to accomplish its task and to maximize its reward. Evidently, the main problem for the agent is to learn a map from a very high-dimensional space (which represents its environment) to a very low-dimensional space (which represents its actions). The high-to-low dimensional map implies that most of the information about the environment is irrelevant for the actions to be taken, and only a small fraction of information is relevant. In this paper we argue that the relevant information need not be learned by brute force (which is the standard approach), but can be identified from the intrinsic symmetries of the system. We analyze in detail a reinforcement learning problem of autonomous driving, where the corresponding symmetry is the Galilean symmetry, and argue that the learning task can be accomplished with very few relevant parameters, or, more precisely, invariants. For a numerical demonstration, we show that the autonomous vehicles (which we call autonomous particles since they describe very primitive vehicles) need only four relevant invariants to learn how to drive very well without colliding with other particles. The simple model can be easily generalized to include different types of particles (e.g. for cars, for pedestrians, for buildings, for road signs, etc.) with different types of relevant invariants describing interactions between them. We also argue that there must exist a field theory description of the learning system where autonomous particles would be described by fermionic degrees of freedom and interactions mediated by the relevant invariants would be described by bosonic degrees of freedom.
This suggests that the effectiveness of field theory descriptions of physical systems might be connected to the learning dynamics of some kinds of autonomous particles, supporting the claim that the entire universe is a neural network.
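The abstract does not spell out which four invariants are used; the sketch below shows one plausible set of Galilean- and rotation-invariant features for a particle pair, chosen purely for illustration:

```python
def pair_invariants(x1, v1, x2, v2):
    """Galilean-invariant features for a pair of particles in 2D.

    Relative position dx and relative velocity dv are unchanged by
    uniform boosts and translations; their scalar products are also
    rotation-invariant. These four specific invariants are an
    illustrative assumption, not the set from the paper.
    """
    dx = (x2[0] - x1[0], x2[1] - x1[1])
    dv = (v2[0] - v1[0], v2[1] - v1[1])
    r2 = dx[0]**2 + dx[1]**2              # squared separation
    s2 = dv[0]**2 + dv[1]**2              # squared relative speed
    rv = dx[0]*dv[0] + dx[1]*dv[1]        # approach rate (negative: closing in)
    cross = dx[0]*dv[1] - dx[1]*dv[0]     # signed impact term (2D cross product)
    return r2, s2, rv, cross

# Boosting both particles by the same velocity leaves the invariants unchanged
a = pair_invariants((0, 0), (1, 0), (3, 4), (0, 1))
boost = (5.0, -2.0)
b = pair_invariants((0, 0), (1 + boost[0], 0 + boost[1]),
                    (3, 4), (0 + boost[0], 1 + boost[1]))
assert a == b
```

A policy that consumes only such invariants is automatically consistent with the Galilean symmetry of the driving problem, which is the dimensionality-reduction mechanism the paper argues for.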


Duke University Health System Joins LeanTaaS to Deliver Keynote

#artificialintelligence

Improving operating room capacity management through data analytics and machine learning will be the breakfast keynote topic of discussion at the upcoming 2020 OR Business Management Conference. Ashley Walsh, senior director of client services at LeanTaaS, Inc., a Silicon Valley software innovator that increases patient access and transforms operational performance for healthcare providers, and Melissa Pressley, management engineer at Duke University Health System (DUHS), will address the audience on Thursday, Jan. 30, at 7:30 a.m. in the Global Ballroom of the Bonaventure Resort & Spa in Weston, Florida. "Improving OR utilization and improving surgeon access to OR time significantly enhances the financial results for hospitals and health systems, increases patient access, and facilitates surgeon recruitment and retention." "DUHS has leveraged EHR data to improve OR access with mobile and web technologies and increase accountability with surgeon-centric metrics and reporting to help our surgeons better understand the 'why' behind OR metrics," said Pressley. "I'm looking forward to sharing how DUHS and LeanTaaS have enhanced the patient experience while balancing surgeon needs, among other improvements." DUHS is among several leading health systems in the U.S. that have deployed the LeanTaaS iQueue for Operating Rooms solution to effect data-driven changes to their approach to capacity management.


Mitigating Bias in A.I.: Tips and Tricks

#artificialintelligence

Last year, we saw a steady drumbeat of stories examining bias in artificial intelligence (A.I.) and machine learning. It turns out that folks' preconceptions have a way of creeping into the logic of machines, tainting the results of facial recognition software, recruiting platforms and law enforcement. Even while they're building A.I.-powered systems intended to remove human prejudice from one process or another, technologists may unwittingly contribute to the core problem. Developers and software engineers, after all, have biases of their own, however unconscious. "You would have to have zero interactions with other humans to not have experienced bias, with or without A.I.," said Meg Bear, senior vice president of products for SAP SuccessFactors in South San Francisco, CA.


Using AI to Improve Engagement Surveys, Continuous Feedback

#artificialintelligence

Improving employee engagement is at the top of many human resource leaders' to-do lists. But combing through an ever-growing amount of engagement survey data to extract actionable insights can be overwhelming. So industry vendors have created artificial intelligence (AI) tools designed to automatically analyze survey data to pinpoint themes and characterize the meaning of words or phrases. Tools like natural language processing (NLP) can save HR time and generate more-useful data along the way. SHRM Online spoke with Armen Berjikly, senior director of growth strategy for Ultimate Software in Weston, Fla., during the HR Technology Conference & Exposition for his thoughts on the state of NLP technology today, the pros and cons of using AI to analyze engagement survey data, and the importance of developing a code of ethics for using AI in HR. Prior to working at Ultimate, Berjikly was the founder and CEO of Kanjoya Inc., a workforce intelligence company that pioneered advancements in NLP technology dedicated to understanding human emotion.